4,147 research outputs found

    Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. Importance: To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available.

    Physiology can predict animal activity, exploration, and dispersal

    Physiology can underlie movement, including short-term activity, exploration of unfamiliar environments, and larger scale dispersal, and thereby influence species distributions in an environmentally sensitive manner. We conducted meta-analyses of the literature to establish, firstly, whether physiological traits underlie activity, exploration, and dispersal by individuals (88 studies), and secondly whether physiological characteristics differed between range core and edges of distributions (43 studies). We show that locomotor performance and metabolism influenced individual movement with varying levels of confidence. Range edges differed from cores in traits that may be associated with dispersal success, including metabolism, locomotor performance, corticosterone levels, and immunity, and differences increased with increasing time since separation. Physiological effects were particularly pronounced in birds and amphibians, but taxon-specific differences may reflect biased sampling in the literature, which also focussed primarily on North America, Europe, and Australia. Hence, physiology can influence movement, but undersampling and bias currently limit general conclusions.

    A Comprehensive Biophysical Description of Pairwise Epistasis throughout an Entire Protein Domain

    Background: Nonadditivity in fitness effects from two or more mutations, termed epistasis, can result in compensation of deleterious mutations or negation of beneficial mutations. Recent evidence shows the importance of epistasis in individual evolutionary pathways. However, an unresolved question in molecular evolution is how often and how significantly fitness effects change in alternative genetic backgrounds. Results: To answer this question, we quantified the effects of all single mutations and double mutations between all positions in the IgG-binding domain of protein G (GB1). By observing the first two steps of all possible evolutionary pathways using this fitness profile, we were able to characterize the extent and magnitude of pairwise epistasis throughout an entire protein molecule. Furthermore, we developed a novel approach to quantitatively determine the effects of single mutations on structural stability (ΔΔGU). This enabled determination of the importance of stability effects in functional epistasis. Conclusions: Our results illustrate common biophysical mechanisms for occurrences of positive and negative epistasis. Our results show pervasive positive epistasis within a conformationally dynamic network of residues. The stability analysis shows that significant negative epistasis, which is more common than positive epistasis, mostly occurs between combinations of destabilizing mutations. Furthermore, we show that although significant positive epistasis is rare, many deleterious mutations are beneficial in at least one alternative mutational background. The distribution of conditionally beneficial mutations throughout the domain demonstrates that the functional portion of sequence space can be significantly expanded by epistasis.
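
    The nonadditivity described in this abstract can be illustrated with a minimal sketch. It assumes a multiplicative null model on relative fitness, so that epistasis is the deviation of the double mutant's log fitness from the sum of the two single-mutant log fitnesses. The function name `pairwise_epistasis` and the example fitness values are hypothetical, and the exact definition used in the paper may differ.

```python
import math


def pairwise_epistasis(w_a: float, w_b: float, w_ab: float) -> float:
    """Return the log-scale epistasis for mutations a and b.

    Assumption (not taken from the abstract): fitness values are relative
    to wild type (wild type = 1.0), and the expected double-mutant fitness
    without epistasis is the product w_a * w_b.
    """
    if min(w_a, w_b, w_ab) <= 0:
        raise ValueError("fitness values must be positive to take logarithms")
    return math.log(w_ab) - math.log(w_a) - math.log(w_b)


# Hypothetical values: two deleterious single mutants whose combination
# is fitter than the multiplicative expectation (positive epistasis).
eps = pairwise_epistasis(w_a=0.5, w_b=0.5, w_ab=0.5)
print(f"epistasis = {eps:.3f}")
```

    Under this convention a value above zero indicates positive epistasis, a value below zero indicates negative epistasis, and a value near zero means the two mutations combine as expected.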

    A benchmark study on error-correction by read-pairing and tag-clustering in amplicon-based deep sequencing

    Figure S1. Sequence properties of protein G. (a) The sequence of the 88 bp template is shown in DRuMS color schemes. The regions where the target sequence overlaps the forward and reverse primers are indicated. (b) The A-T and C-G density plot along the target sequence. The Matlab nucleotide sequence analysis toolbox was used to plot this figure. (EPS 498 kb)

    Accurate Viral Population Assembly From Ultra-Deep Sequencing Data

    Motivation: Next-generation sequencing technologies sequence viruses with ultra-deep coverage, thus promising to revolutionize our understanding of the underlying diversity of viral populations. While the sequencing coverage is high enough that even rare viral variants are sequenced, the presence of sequencing errors makes it difficult to distinguish between rare variants and sequencing errors. Results: In this article, we present a method to overcome the limitations of sequencing technologies and assemble a diverse viral population that allows for the detection of previously undiscovered rare variants. The proposed method consists of a high-fidelity sequencing protocol and an accurate viral population assembly method, referred to as Viral Genome Assembler (VGA). The proposed protocol is able to eliminate sequencing errors by using individual barcodes attached to the sequencing fragments. Highly accurate data in combination with deep coverage allow VGA to assemble rare variants. VGA uses an expectation–maximization algorithm to estimate abundances of the assembled viral variants in the population. Results on both synthetic and real datasets show that our method is able to accurately assemble an HIV viral population and detect rare variants previously undetectable due to sequencing errors. VGA outperforms state-of-the-art methods for genome-wide viral assembly. Furthermore, our method is the first viral assembly method that scales to millions of sequencing reads.
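
    The abundance-estimation step mentioned in this abstract can be illustrated with a generic expectation-maximization sketch. This is not VGA's implementation: the function `em_abundances`, the representation of each read as the list of variant indices it is compatible with, and the toy data are all assumptions made for illustration.

```python
def em_abundances(read_compat, n_variants, n_iter=100):
    """Estimate variant abundances from ambiguous read assignments.

    read_compat: one list per read, holding the indices of the variants
    that read is compatible with (a hypothetical input format).
    Returns a list of abundances that sums to 1.
    """
    # Start from a uniform guess over all variants.
    theta = [1.0 / n_variants] * n_variants
    for _ in range(n_iter):
        counts = [0.0] * n_variants
        # E-step: split each read among its compatible variants in
        # proportion to the current abundance estimates.
        for compat in read_compat:
            total = sum(theta[v] for v in compat)
            if total == 0:
                continue
            for v in compat:
                counts[v] += theta[v] / total
        # M-step: renormalize the fractional counts into abundances.
        assigned = sum(counts)
        if assigned == 0:
            return theta
        theta = [c / assigned for c in counts]
    return theta


# Toy data: 6 reads unique to variant 0, 2 unique to variant 1,
# and 2 reads compatible with both.
reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 2
print(em_abundances(reads, n_variants=2))
```

    In this toy case the ambiguous reads end up shared in proportion to the evidence from the uniquely assigned reads, which is the behavior that makes such a scheme useful when variants share most of their sequence.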

    A comprehensive functional map of the hepatitis C virus genome provides a resource for probing viral proteins.

    Pairing high-throughput sequencing technologies with high-throughput mutagenesis enables genome-wide investigations of pathogenic organisms. Knowledge of the specific functions of protein domains encoded by the genome of the hepatitis C virus (HCV), a major human pathogen that contributes to liver disease worldwide, remains limited to insight from small-scale studies. To enhance the capabilities of HCV researchers, we have obtained a high-resolution functional map of the entire viral genome by combining transposon-based insertional mutagenesis with next-generation sequencing. We generated a library of 8,398 mutagenized HCV clones, each containing one 15-nucleotide sequence inserted at a unique genomic position. We passaged this library in hepatic cells, recovered virus pools, and simultaneously assayed the abundance of mutant viruses in each pool by next-generation sequencing. To illustrate the validity of the functional profile, we compared the genetic footprints of viral proteins with previously solved protein structures. Moreover, we show the utility of these genetic footprints in the identification of candidate regions for epitope tag insertion. In a second application, we screened the genetic footprints for phenotypes that reflected defects in later steps of the viral life cycle. We confirmed that viruses with insertions in a region of the nonstructural protein NS4B had a defect in infectivity while maintaining genome replication. Overall, our genome-wide HCV mutant library and the genetic footprints obtained by high-resolution profiling represent valuable new resources for the research community that can direct the attention of investigators toward unidentified roles of individual protein domains. Importance: Our insertional mutagenesis library provides a resource that illustrates the effects of relatively small insertions on local protein structure and HCV viability. We have also generated complementary resources, including a website (http://hangfei.bol.ucla.edu) and a panel of epitope-tagged mutant viruses that should enhance the research capabilities of investigators studying HCV. Researchers can now detect epitope-tagged viral proteins by established antibodies, which will allow biochemical studies of HCV proteins for which antibodies are not readily available. Furthermore, researchers can now quickly look up genotype-phenotype relationships and base further mechanistic studies on the residue-by-residue information from the functional profile. More broadly, this approach offers a general strategy for the systematic functional characterization of viruses on the genome scale.

    The Properties of Radio Galaxies and the Effect of Environment in Large Scale Structures at z∼1

    In this study we investigate 89 radio galaxies that are spectroscopically-confirmed to be members of five large scale structures in the redshift range of 0.65≤z≤0.96. Based on a two-stage classification scheme, the radio galaxies are classified into three sub-classes: active galactic nucleus (AGN), hybrid, and star-forming galaxy (SFG). We study the properties of the three radio sub-classes and their global and local environmental preferences. We find AGN hosts are the most massive population and exhibit quiescence in their star-formation activity. The SFG population has a comparable stellar mass to those hosting a radio AGN but are unequivocally powered by star formation. Hybrids, though selected as an intermediate population in our classification scheme, were found in almost all analyses to be a unique type of radio galaxy rather than a mixture of AGN and SFGs. They are dominated by a high-excitation radio galaxy (HERG) population. We discuss environmental effects and scenarios for each sub-class. AGN tend to be preferentially located in locally dense environments and in the cores of clusters/groups, with these preferences persisting when comparing to galaxies of similar colour and stellar mass, suggesting that their activity may be ignited in the cluster/group virialized core regions. Conversely, SFGs exhibit a strong preference for intermediate-density global environments, suggesting that dusty starbursting activity in LSSs is largely driven by galaxy-galaxy interactions and merging. Comment: 28 pages, 10 figures, accepted to MNRAS

    Chromosomal DNA deletion confers phage resistance to Pseudomonas aeruginosa.

    Bacteria develop a broad range of phage resistance mechanisms, such as prevention of phage adsorption and the CRISPR/Cas system, to survive phage predation. In this study, Pseudomonas aeruginosa strain PA1 was infected with the lytic phage PaP1, and phage-resistant mutants were selected. A high percentage (~30%) of these mutants displayed a red pigmentation phenotype (Red mutants). Through comparative genomic analysis, one Red mutant, PA1r, was found to have a 219.6 kb genomic fragment deletion, which contains two key genes, hmgA and galU, related to the observed phenotypes. Deletion of hmgA resulted in the accumulation of a red compound, homogentisic acid, while a galU mutant is devoid of O-antigen, which is required for phage adsorption. Intriguingly, while the loss of galU conferred phage resistance, it significantly attenuated PA1r in a mouse infection experiment. Our study revealed a novel phage resistance mechanism via chromosomal DNA deletion in P. aeruginosa.